Run with a single probability measure
Run with various measures, especially focusing on extreme plausible values
No clear principles for choosing ranges
No weighting of options
\(p(X) = p(Y) = p(X\vert Y) = 1/3\)
\(q(X) = q(Y) = q(X\vert Y) = 2/3\)
\(r =p/2+q/2\)
\(r(X\cap Y) = 5/18 \neq r(X)r(Y)=1/4\)
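The failure of independence under the linear pool can be checked with exact arithmetic. A minimal sketch (variable names are mine; the joint probabilities follow from \(p(X\cap Y) = p(X\vert Y)\,p(Y)\)):

```python
from fractions import Fraction

half = Fraction(1, 2)

# Under p, X and Y are independent: p(X) = p(Y) = p(X|Y) = 1/3,
# so p(X ∩ Y) = p(X|Y) * p(Y) = 1/9.
p_XY = Fraction(1, 3) * Fraction(1, 3)
# Under q likewise: q(X ∩ Y) = (2/3) * (2/3) = 4/9.
q_XY = Fraction(2, 3) * Fraction(2, 3)

# Linear pool r = p/2 + q/2
r_X = half * Fraction(1, 3) + half * Fraction(2, 3)  # = 1/2
r_Y = r_X
r_XY = half * p_XY + half * q_XY                     # = 5/18

print(r_XY)       # 5/18
print(r_X * r_Y)  # 1/4 -> independence is lost under r
```

So although both pooled measures treat \(X\) and \(Y\) as independent, their equal-weight mixture does not.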
\(\mathsf{P}(A=B) <1\)
\(\mathsf{P}(r\vert A=a) = a, \quad \mathsf{P}(r\vert B=b) = b\)
\(\forall a,b \, \mathsf{P}(r \vert A =a, B = b) = \alpha a + \beta b\)
For any proposition \(r\), if \(A\) and \(B\) are random variables taking values in the unit interval, there is no probability measure \(\mathsf{P}\) for which these conditions hold
For any group of peers whose credences in a proposition \(X\) range from \(x\) to \(y\), the aggregated credence is within \([x, y]\)
A doctor is fairly confident that a treatment dosage for a patient is correct (credence 0.97) and considers the opinion of a colleague, whose credence is 0.96
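Linear pooling satisfies this boundedness property by construction: a weighted average of credences cannot fall outside their range. A minimal sketch (the equal weights for doctor and colleague are an assumption for illustration):

```python
def linear_pool(credences, weights):
    """Weighted average of peer credences; weights must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(c * w for c, w in zip(credences, weights))

# Doctor (0.97) and colleague (0.96), pooled with equal weights
pooled = linear_pool([0.97, 0.96], [0.5, 0.5])
print(pooled)                   # 0.965
assert 0.96 <= pooled <= 0.97   # the aggregate stays within [x, y]
```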
This type of fiber occurs in 25% of carpets in this town…
…based on a random sample of 40/100/200 carpets.
Items of evidence leading to different expected values should be able to have the same weight
Items of evidence leading to the same value should be able to have different weights
In simple set-ups, such as Bernoulli trials, weight should increase with the number of observations
For unimodal distributions, the wider the distribution associated with a given piece of evidence, the less weight this evidence has
Weights as associated with distributions
Weight of evidence results from comparing the weights of distributions
The right path is: 011.
You reach one of \(m=8\) possible destinations by making binary decisions at \(\log_2(8)=3\) forks
Initially you thought the probability that a given turn is the right one was .5
Now you know it is the right one. Surprise: \(\frac{1}{.5}=2\)
One bit of information: \(\log_2\left(\frac{1}{.5}\right)=1\)
Complete instruction: \(\log_2\left(\frac{1}{.5^3}\right)=3\)
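The surprisal arithmetic above can be checked directly (the function name is mine, matching the \(h(x)\) defined next):

```python
import math

def surprisal_bits(p):
    """Bits of information in learning that an outcome of probability p occurred."""
    return -math.log2(p)

print(surprisal_bits(0.5))       # one fork instruction: 1.0 bit
print(surprisal_bits(0.5 ** 3))  # the complete path 011: 3.0 bits
```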
Notice that \(\log_2\left(\frac{1}{a}\right)= - \log_2(a)\), so: \(h(x) = - \log_2 \mathsf{P}(x)\)
\(H(X) = \sum \mathsf{P}(x_i) \log_2 \frac{1}{\mathsf{P}(x_i)} = - \sum \mathsf{P}(x_i) \log_2 \mathsf{P}(x_i)\)
The expected amount of information you receive once you learn the value of \(X\).
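A minimal sketch of \(H(X)\), with the usual convention that terms with \(\mathsf{P}(x)=0\) contribute nothing:

```python
import math

def entropy(probs):
    """H(X) = -sum P(x) log2 P(x); zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin: 1.0 bit expected
print(entropy([1/8] * 8))   # uniform over the 8 destinations: 3.0 bits
```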
The more informative a piece of evidence is compared to the uniform distribution, the more weight it has, on a scale from 0 to 1.
\(\mathsf{adw(posterior)} = 1 - \left( \frac{H(\mathsf{posterior})}{H(\mathsf{uniform})}\right)\)
\(\mathsf{rdw(posterior, prior)} = 1 - \left( \frac{H(\mathsf{posterior})}{H(\mathsf{prior})}\right)\)
\(\mathsf{wDelta}(\mathsf{posterior, prior}) = \vert \mathsf{adw}(\mathsf{posterior}) - \mathsf{adw}(\mathsf{prior})\vert\)
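The three measures can be sketched as follows (function names are my renderings of \(\mathsf{adw}\), \(\mathsf{rdw}\), and \(\mathsf{wDelta}\); the example distributions are assumptions for illustration):

```python
import math

def entropy(probs):
    """H(X) = -sum P(x) log2 P(x); zero-probability terms contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def adw(posterior):
    """Absolute distribution weight: 1 - H(posterior)/H(uniform)."""
    uniform = [1 / len(posterior)] * len(posterior)
    return 1 - entropy(posterior) / entropy(uniform)

def rdw(posterior, prior):
    """Relative distribution weight: 1 - H(posterior)/H(prior)."""
    return 1 - entropy(posterior) / entropy(prior)

def w_delta(posterior, prior):
    """Change in absolute weight between prior and posterior."""
    return abs(adw(posterior) - adw(prior))

uniform = [0.25] * 4
peaked = [0.97, 0.01, 0.01, 0.01]

print(adw(uniform))             # 0.0: the uniform distribution carries no weight
print(adw(peaked))              # well above 0: a peaked posterior carries high weight
print(w_delta(peaked, uniform))
```

Note that with a uniform prior, \(\mathsf{rdw}\) coincides with \(\mathsf{adw}\), and \(\mathsf{rdw}\) is undefined when the prior has zero entropy.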
The higher-order approach